reaction center
Towards understanding retrosynthesis by energy-based models
Retrosynthesis is the process of identifying a set of reactants to synthesize a target molecule. It is critical to material design and drug discovery. Existing machine learning approaches based on language models and graph neural networks have achieved encouraging results. However, the inner connections of these models are rarely discussed, and rigorous evaluations of these models are largely in need.
Prime Implicant Explanations for Reaction Feasibility Prediction
Weinbauer, Klaus, Phan, Tieu-Long, Stadler, Peter F., Gärtner, Thomas, Malhotra, Sagar
Machine learning models that predict the feasibility of chemical reactions have become central to automated synthesis planning. Despite their predictive success, these models often lack transparency and interpretability. We introduce a novel formulation of prime implicant explanations (also known as minimally sufficient reasons) tailored to this domain, and propose an algorithm for computing such explanations in small-scale reaction prediction tasks. Preliminary experiments demonstrate that our notion of prime implicant explanations conservatively captures the ground-truth explanations. That is, such explanations often contain redundant bonds and atoms but consistently capture the molecular attributes that are essential for predicting reaction feasibility.
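To make the idea of a minimally sufficient reason concrete, the sketch below computes one for a toy Boolean classifier. It is not the authors' algorithm or their molecular-graph formulation: the feature names (`carbonyl`, `amine`, `steric_block`, `aromatic_ring`), the `toy_model` rule, and both helper functions are hypothetical and chosen only for illustration. A feature subset counts as sufficient if fixing those features to their observed values forces the model's prediction no matter how the remaining features are set; a greedy deletion pass then yields a subset-minimal sufficient set. The exhaustive check is exponential in the number of free features, so it is only workable at small scale.

```python
from itertools import product


def is_sufficient(model, instance, fixed):
    """Return True if fixing `fixed` features to their values in `instance`
    forces the same prediction for every setting of the other features."""
    target = model(instance)
    free = [name for name in instance if name not in fixed]
    # Brute force over all assignments of the free features (exponential).
    for values in product([False, True], repeat=len(free)):
        candidate = dict(instance)
        candidate.update(zip(free, values))
        if model(candidate) != target:
            return False
    return True


def minimal_sufficient_reason(model, instance):
    """Greedy deletion: drop each feature whose removal keeps the set sufficient.
    Because supersets of sufficient sets are sufficient, a single pass
    leaves a subset-minimal sufficient set."""
    fixed = set(instance)
    for feature in list(instance):
        trial = fixed - {feature}
        if is_sufficient(model, instance, trial):
            fixed = trial
    return fixed


def toy_model(x):
    # Hypothetical feasibility rule, for illustration only.
    return x["carbonyl"] and x["amine"] and not x["steric_block"]


instance = {
    "carbonyl": True,
    "amine": True,
    "steric_block": False,
    "aromatic_ring": True,
}
reason = minimal_sufficient_reason(toy_model, instance)
print(sorted(reason))
```

In this toy case the irrelevant `aromatic_ring` feature is dropped, while the three features the rule depends on are retained. Greedy deletion finds one subset-minimal explanation, not necessarily the smallest one, and the result can depend on the order in which features are tried.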
SynCoGen: Synthesizable 3D Molecule Generation via Joint Reaction and Coordinate Modeling
Rekesh, Andrei, Cretu, Miruna, Shevchuk, Dmytro, Somnath, Vignesh Ram, Liò, Pietro, Batey, Robert A., Tyers, Mike, Koziarski, Michał, Liu, Cheng-Hao
Ensuring synthesizability in generative small molecule design remains a major challenge. While recent developments in synthesizable molecule generation have demonstrated promising results, these efforts have been largely confined to 2D molecular graph representations, limiting the ability to perform geometry-based conditional generation. In this work, we present SynCoGen (Synthesizable Co-Generation), a single framework that combines simultaneous masked graph diffusion and flow matching for synthesizable 3D molecule generation. SynCoGen samples from the joint distribution of molecular building blocks, chemical reactions, and atomic coordinates. To train the model, we curated SynSpace, a dataset containing over 600K synthesis-aware building block graphs and 3.3M conformers. SynCoGen achieves state-of-the-art performance in unconditional small molecule graph and conformer generation, and the model delivers competitive performance in zero-shot molecular linker design for protein ligand generation in drug discovery. Overall, this multimodal formulation represents a foundation for future applications enabled by non-autoregressive molecular generation, including analog expansion, lead optimization, and direct structure conditioning.